Supplementary Materials Overview
This supplement provides detailed methods, additional results, and reproducibility information for the main manuscript. All items are cross-referenced to specific sections of the main text.
This section provides extended methodological details for readers seeking to understand or replicate our analyses. Each subsection corresponds to a specific aspect of the main text Methods section.
Main text reference: Methods → Study System (page 3)
Surveys were conducted at three reef sites representing distinct reef environments around Mo’orea, French Polynesia (17°30’S, 149°50’W) during June–August 2019. Sites were selected to capture environmental heterogeneity that might influence cryptofauna community assembly.
| Site | Code | Reef Type | Coordinates | Depth | Environment | Key Features |
|---|---|---|---|---|---|---|
| Hauru | HAU | Fringing | -17.4833°, -149.8667° | 1.2–4.8 m | Moderate exposure | High water circulation, diverse coral cover |
| Maatea | MAT | Back-reef lagoon | -17.5333°, -149.8000° | 2.1–5.9 m | Sheltered | Calm conditions, high sedimentation |
| Barrier Reef | MRB | Barrier reef crest | -17.4667°, -149.8333° | 3.2–7.6 m | High energy | Oceanic swells, high coral density |
Why these sites matter: The three sites represent a gradient from sheltered lagoon (MAT) to exposed barrier reef (MRB), allowing us to test whether cryptofauna assembly rules generalize across environmental contexts or depend on local conditions. This design directly supports our third research question about reef environment effects on community composition.
Reproducibility: Site coordinates and
characteristics are in
data/survey_coral_characteristics_merged_v2.csv. Map
generated by scripts/03_spatial_patterns.R.
Main text reference: Methods → Coral and Fauna Sampling (page 3); supports all scaling analyses
Accurate volume estimation is critical because our central hypothesis (propagule dilution) predicts that fauna abundance scales sublinearly with habitat volume.
Volume Calculation Protocol
Colony volume was calculated using the hemi-ellipsoid approximation:
\[V = \frac{2}{3} \pi \times r_1 \times r_2 \times h\]
Where:
- \(r_1\) = semi-major axis (half of maximum diameter)
- \(r_2\) = semi-minor axis (half of perpendicular diameter)
- \(h\) = colony height
Validation: This method was validated against water displacement measurements for 25 Pocillopora colonies (R² = 0.94, slope = 0.98), confirming that the hemi-ellipsoid approximation closely matches true volume.
Range of colony sizes: Our sampling captured colonies spanning three orders of magnitude in volume (12–18,400 cm³), providing statistical power to detect scaling relationships.
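As an illustration of the formula above, the following Python sketch computes hemi-ellipsoid volume from the three field measurements. The function and argument names are hypothetical and are not taken from scripts/01_load_clean_data.R; the example dimensions are invented.

```python
import math

def hemi_ellipsoid_volume(max_diameter_cm, perp_diameter_cm, height_cm):
    """Return colony volume (cm^3) as V = (2/3) * pi * r1 * r2 * h."""
    r1 = max_diameter_cm / 2.0   # semi-major axis
    r2 = perp_diameter_cm / 2.0  # semi-minor axis
    return (2.0 / 3.0) * math.pi * r1 * r2 * height_cm

# Hypothetical colony: 20 cm x 16 cm footprint, 12 cm tall
volume = hemi_ellipsoid_volume(20.0, 16.0, 12.0)
print(f"{volume:.0f} cm^3")  # 640 * pi, roughly 2011 cm^3
```

Note that the function takes full diameters and halves them internally, which mirrors the definitions of \(r_1\) and \(r_2\) given above.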
Reproducibility: Volume calculations in
scripts/01_load_clean_data.R (lines 45–60). Raw
measurements in
data/survey_coral_characteristics_merged_v2.csv.
Main text reference: Methods → Coral and Fauna Sampling (page 3)
Complete extraction of cryptofauna is essential for accurate community characterization. Our protocol was designed to maximize recovery of mobile and sessile organisms from the complex three-dimensional structure of branching corals.
Extraction Protocol:
Identification: Specimens sorted to operational taxonomic units (OTUs) based on morphological characters under dissecting microscope (10–40× magnification). The 243 OTUs include 87 identified to species level, with the remainder identified to genus or family.
Limitations: Molecular barcoding was not performed; cryptic species may be conflated within morphological OTUs. However, community-level patterns should be robust to moderate identification uncertainty.
Reproducibility: Fauna data in
data/survey_cafi_data_w_taxonomy_summer2019_v5.csv.
Taxonomic summaries generated by
scripts/02_community_composition.R.
Main text reference: Results → Result 4: Coral Condition Independent of Size (page 6); Methods → Statistical Analyses (implicit)
This methodological innovation addresses a systematic bias in coral physiology sampling that could confound condition–diversity relationships.
The Problem
Physiological measurements were taken from branch tips at varying positions on the colony. Sampling position (measured as “stump length”, the distance from colony base to sampling point) correlates strongly with colony size (r = 0.565, p < 0.001; Figure S3A).
The Solution: Residual Analysis
For each physiological trait (protein, carbohydrate, zooxanthellae density, AFDW):
trait ~ stump_length (residuals retained as the position-corrected trait)
Condition Score = PC1 of position-corrected traits (~60% variance explained)
Validation: Corrected condition score shows |r| < 0.10 with colony volume (vs. r = 0.42 for uncorrected score), confirming successful removal of size confound.
Why this matters: Without position correction, any relationship between fauna diversity and coral “condition” could simply reflect the shared correlation of both variables with colony size. Our correction isolates true condition variation from size effects.
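The residual step can be sketched as follows. This is a minimal Python illustration with invented numbers, not the implementation in scripts/05a_coral_characteristics.R; `ols_residuals` and `pearson_r` are hypothetical helper names. It shows that residuals from a regression of a trait on stump length are, by construction, uncorrelated with stump length.

```python
import math

def ols_residuals(x, y):
    """Residuals of y after a simple least-squares regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Invented example: a trait that rises with sampling position
stump_length = [2.0, 3.5, 5.0, 6.5, 8.0, 9.5]
protein = [1.9, 2.6, 2.8, 3.9, 4.1, 5.0]

corrected = ols_residuals(stump_length, protein)
print(round(pearson_r(stump_length, protein), 2))        # strongly positive
print(abs(pearson_r(stump_length, corrected)) < 1e-9)    # position signal removed
```

The corrected values would then feed the PCA that defines the condition score; that step is omitted here.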
Reproducibility: Position correction implemented in
scripts/05a_coral_characteristics.R (lines 120–180).
Validation plots in Figure S3.
Main text reference: Methods → Statistical Analyses (page 4)
This section provides complete model specifications for readers wishing to replicate our analyses or understand the statistical framework.
Purpose: Test whether fauna abundance scales sublinearly with coral volume (supporting propagule dilution hypothesis).
log10(Abundance) ~ log10(Volume) + Branch_Width + (1 + log10(Volume) | Site)
Purpose: Partition variation in community composition among site, volume, and coral characteristics.
Purpose: Identify non-random species associations and keystone taxa.
Reproducibility: All models in
scripts/05_coral_cafi_relationships.R (scaling),
scripts/04_diversity_analysis.R (PERMANOVA),
scripts/06_network_analysis.R (networks).
All tables provide detailed statistical results that support findings described in the main text. Each table indicates the specific main text section it supports.
Supports: Main text Methods → Study System; provides sample sizes for all analyses
This table summarizes the sampling effort and basic community metrics for each site, providing context for interpreting site-specific results.
| Site | N Corals | CAFI Individuals | Species (OTUs) | Mean Volume (cm³) | Depth Range (m) |
|---|---|---|---|---|---|
| HAU | 38 | 4,234 | 62 | 1,245 ± 892 | 1.2–4.8 |
| MAT | 39 | 5,102 | 71 | 1,567 ± 1,102 | 2.1–5.9 |
| MRB | 35 | 3,498 | 58 | 1,089 ± 756 | 3.2–7.6 |
| Total | 112 | 12,834 | 87 unique | 1,312 ± 945 | 1.2–7.6 |
Reproducibility: Generated by
scripts/01_load_clean_data.R. Source data:
data/survey_coral_characteristics_merged_v2.csv.
Supports: Main text Results → Result 1: Reef Environment Structures Community Composition (page 5)
This table presents the full PERMANOVA results showing how much community variation is explained by each factor. The dominant site effect (R² = 0.106) supports our finding that reef environment shapes cryptofauna communities more strongly than coral morphology.
| Term | Df | SS | R² | F | p |
|---|---|---|---|---|---|
| Site | 2 | 2.847 | 0.106 | 4.31 | 0.001 |
| log(Volume) | 1 | 1.234 | 0.046 | 3.74 | 0.002 |
| Branch Width | 1 | 0.567 | 0.021 | 1.72 | 0.078 |
| Depth | 1 | 0.423 | 0.016 | 1.28 | 0.198 |
| Site × Volume | 2 | 0.312 | 0.012 | 0.94 | 0.456 |
| Residual | 104 | 21.456 | 0.799 | — | — |
| Total | 111 | 26.839 | 1.000 | — | — |
Interpretation for naive readers: PERMANOVA (Permutational Multivariate Analysis of Variance) partitions variation in community composition among predictor variables. R² indicates the proportion of total variation explained by each factor. Here, Site explains 10.6% of variation—more than any coral-level characteristic—indicating that reef environment filters community composition.
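The R² column can be reproduced directly from the sums of squares, since each R² is a term's SS divided by the total SS. The short Python check below uses the values from the table above; the function name is hypothetical.

```python
# Sums of squares copied from the PERMANOVA table above
sums_of_squares = {
    "Site": 2.847,
    "log(Volume)": 1.234,
    "Branch Width": 0.567,
    "Depth": 0.423,
    "Site x Volume": 0.312,
    "Residual": 21.456,
}

def variance_fractions(ss_by_term):
    """Return each term's share of the total sum of squares (its R2)."""
    total = sum(ss_by_term.values())
    return {term: ss / total for term, ss in ss_by_term.items()}

r2 = variance_fractions(sums_of_squares)
for term, value in r2.items():
    print(f"{term:14s} R2 = {value:.3f}")
```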
Reproducibility: Generated by
scripts/04_diversity_analysis.R (lines 200–250). Output
saved to output/tables/permanova_results.csv.
Supports: Main text Results → Result 2: Sublinear Scaling Reveals Propagule Dilution (page 5)
This table presents the full model output for the scaling analysis, including random effects variance. The scaling exponent β = 0.46 is significantly less than 1.0, supporting the propagule dilution hypothesis.
| Effect | Estimate | SE | 95% CI | z / p |
|---|---|---|---|---|
| Fixed Effects | | | | |
| Intercept | 1.234 | 0.156 | [0.93, 1.54] | 7.91 / <0.001 |
| log(Volume) | 0.458 | 0.038 | [0.38, 0.53] | 12.05 / <0.001 |
| Branch Width [wide] | 0.287 | 0.089 | [0.11, 0.46] | 3.22 / 0.001 |
| Random Effects | | | | |
| Site (Intercept) SD | 0.484 | — | — | — |
| Site (Slope) SD | 0.110 | — | — | — |
| Residual SD | 0.876 | — | — | — |
Interpretation for naive readers: The scaling exponent (β = 0.46) describes how fauna abundance changes with coral volume. A value of 1.0 would indicate isometric scaling (doubling volume doubles abundance). Our value of 0.46 means fauna density decreases in larger corals: a 10-fold volume increase yields only a 2.9-fold abundance increase (10^0.46 ≈ 2.9), consistent with limited propagule supply being spread across more habitat.
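The arithmetic behind a power-law exponent can be checked in a few lines. Under Abundance ∝ Volume^β, multiplying volume by a factor k multiplies abundance by k^β and density (abundance per unit volume) by k^(β − 1). The Python sketch below is illustrative only; `fold_change` is a hypothetical name.

```python
def fold_change(volume_ratio, exponent):
    """Multiplicative change implied by a power law with the given exponent."""
    return volume_ratio ** exponent

beta = 0.46  # scaling exponent reported above

abundance_fold = fold_change(10.0, beta)        # change in abundance for 10x volume
density_fold = fold_change(10.0, beta - 1.0)    # change in abundance per unit volume

print(f"10x volume -> {abundance_fold:.2f}x abundance")
print(f"10x volume -> {density_fold:.2f}x density")
print(f" 2x volume -> {fold_change(2.0, beta):.2f}x abundance")
```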
Reproducibility: Generated by
scripts/05_coral_cafi_relationships.R (lines 80–150). Model
object saved to output/objects/scaling_glmm.rds.
Supports: Main text Results → Result 2, Discussion → Propagule Dilution (pages 5, 7)
This table shows that sublinear scaling is consistent across all three reef sites, strengthening the generality of our findings.
| Site | β (Exponent) | SE | 95% CI | p (β < 1) | R² |
|---|---|---|---|---|---|
| HAU (Fringing) | 0.52 | 0.08 | [0.36, 0.68] | <0.001 | 0.54 |
| MAT (Lagoon) | 0.44 | 0.07 | [0.30, 0.58] | <0.001 | 0.62 |
| MRB (Barrier) | 0.50 | 0.08 | [0.34, 0.66] | <0.001 | 0.51 |
| Overall | 0.46 | 0.04 | [0.38, 0.53] | <0.001 | 0.58 |
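The confidence intervals in this table are consistent with a normal (Wald) approximation, β ± 1.96 × SE, although the supplement does not state how they were computed; treat that as an assumption. The Python sketch below reproduces the HAU row and a one-sided test of β < 1 under that assumption. `wald_summary` is a hypothetical helper.

```python
import math

def wald_summary(beta, se, null_value=1.0, z_crit=1.96):
    """Wald interval plus a one-sided normal test of beta < null_value."""
    lower = beta - z_crit * se
    upper = beta + z_crit * se
    z = (beta - null_value) / se
    # Standard normal CDF evaluated at z, via the error function
    p_one_sided = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return lower, upper, z, p_one_sided

# HAU row: beta = 0.52, SE = 0.08
lower, upper, z, p = wald_summary(0.52, 0.08)
print(f"95% CI [{lower:.2f}, {upper:.2f}], z = {z:.1f}, p < 0.001: {p < 0.001}")
```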
Reproducibility: Site-specific models in
scripts/05_coral_cafi_relationships.R (lines 160–200).
Supports: Main text Results → Result 3: Network Structure Reveals Non-Random Assembly (page 6)
This table compares observed network properties to null model expectations, demonstrating that cryptofauna co-occurrence networks are significantly more structured than expected by chance.
| Metric | Observed | Null (mean ± SD) | z-score | p | Interpretation |
|---|---|---|---|---|---|
| Transitivity (clustering) | 0.28 | 0.07 ± 0.02 | 7.85 | <0.0001 | 3.8× higher clustering |
| Modularity (Q) | 0.52 | 0.08 ± 0.02 | 22.0 | <0.0001 | Non-random modules |
| Mean path length | 2.34 | 2.89 ± 0.12 | -4.58 | <0.001 | Shorter paths |
| Number of modules | 6 | — | — | — | Distinct species groups |
Interpretation for naive readers: Transitivity measures how often species that share a common associate also co-occur with each other. Our networks show 3.8× higher transitivity than expected by chance, meaning cryptofauna form cohesive groups of frequently co-occurring species—not random assemblages.
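To make the two quantities concrete, the Python sketch below computes transitivity for a four-species toy network and the z-score used to compare an observed metric with a null distribution. The toy network is invented, and the function names are hypothetical; the actual analysis lives in scripts/06_network_analysis.R.

```python
from itertools import combinations

def transitivity(adjacency):
    """Fraction of connected triples (two neighbours of one node) that are closed."""
    closed = 0
    triples = 0
    for neighbours in adjacency.values():
        for a, b in combinations(sorted(neighbours), 2):
            triples += 1
            if b in adjacency[a]:
                closed += 1
    return closed / triples if triples else 0.0

def z_score(observed, null_mean, null_sd):
    """Standardised distance of an observed metric from its null expectation."""
    return (observed - null_mean) / null_sd

# Toy co-occurrence network: A, B, C form a triangle; D attaches to C only
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(transitivity(toy))                      # 3 closed of 5 triples
print(round(z_score(0.52, 0.08, 0.02), 1))    # modularity row of the table
```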
Reproducibility: Network analysis in
scripts/06_network_analysis.R. Metrics saved to
output/tables/cafi_network_metrics.csv.
Supports: Main text Results → Result 3, Discussion → Network Modules (pages 6, 7–8)
| Rank | Species | Functional Group | Degree | Betweenness | Module | Role |
|---|---|---|---|---|---|---|
| 1 | Alpheus diadema | Snapping shrimp | 12 | 260 | 5 | Hub/Connector |
| 2 | Alpheus collumianus | Snapping shrimp | 8 | 171 | 5 | Hub |
| 3 | Caracanthus maculatus | Coral croucher fish | 7 | 232 | 3 | Connector |
| 4 | Trapezia serenei | Guardian crab | 6 | 45 | 1 | Hub |
| 5 | Alpheus lottini | Snapping shrimp | 6 | 89 | 1 | Hub |
| 6 | Macrophiothrix longipeda | Brittle star | 5 | 12 | 2 | Peripheral |
Interpretation for naive readers: Structurally central species are those with many connections in the co-occurrence network. Alpheus diadema (snapping shrimp) has the highest “betweenness,” meaning it connects otherwise separate groups of species. However, high network centrality indicates structural importance—not functional “keystoneness,” which would require experimental removal to demonstrate.
Reproducibility: Centrality analysis in
scripts/06_network_analysis.R (lines 150–200). Full species
table in output/tables/cafi_keystone_species.csv.
Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6)
| Predictor | Estimate | SE | 95% CI | t | p |
|---|---|---|---|---|---|
| Colony volume (log) | −0.02 | 0.06 | [−0.14, 0.10] | −0.33 | 0.74 |
| Neighbor count | 0.04 | 0.04 | [−0.03, 0.11] | 1.10 | 0.27 |
| Site (HAU vs MAT) | −0.11 | 0.27 | [−0.64, 0.42] | −0.41 | 0.68 |
| Site (MRB vs MAT) | 0.19 | 0.28 | [−0.36, 0.74] | 0.68 | 0.50 |
Reproducibility: Condition models in
scripts/05a_coral_characteristics.R and
scripts/18_cafi_predicts_condition.R.
Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6)
This section provides the complete diagnostic analysis demonstrating that the apparent richness-condition relationship is a sampling artifact. When proper statistical corrections are applied, the effect disappears entirely.
Raw species richness is confounded with sampling effort because larger corals support more individuals, and more individuals yield more observed species regardless of any ecological relationship. Our diagnostic analysis quantifies this confound:
| Metric | Value | p-value | Interpretation |
|---|---|---|---|
| Abundance–Richness correlation | r = 0.813 | < 0.001 | Very strong positive—classic species-area effect |
| Richness variance explained by abundance | R² = 72.3% | < 0.001 | Most richness variation is sampling effort, not true diversity |
| Richness–Volume correlation | r = 0.668 | < 0.001 | Larger corals have higher richness (sampling artifact) |
| Condition–Volume correlation | r = −0.042 | 0.70 | Condition independent of volume (good) |
We applied several complementary approaches to control for sampling artifacts. Only the naive raw-richness model appears significant; every corrected metric yields a null result:
| Diversity Metric | Definition | β (effect) | SE | p-value | Conclusion |
|---|---|---|---|---|---|
| Raw richness (naive) | Simple species count | +0.058 | 0.028 | 0.041 | Appears significant (ARTIFACT) |
| Raw richness + abundance | Richness controlling for # individuals | +0.069 | 0.042 | 0.104 | Effect disappears with abundance control |
| Rarefied richness (n=10) | Expected richness at standard sample size | −0.011 | 0.164 | 0.93 | NO EFFECT—sampling artifact removed |
| Residualized richness | Richness independent of abundance (regression residuals) | +0.055 | 0.040 | 0.17 | No pure diversity effect |
| Evenness (Pielou’s J) | Relative abundance distribution (abundance-independent) | −2.94 | 1.80 | 0.11 | No effect independent of richness |
| Shannon H’ | Information-theoretic diversity | +0.087 | 0.364 | 0.81 | No effect |
Statistical Details: Rarefaction and Residualization
Rarefaction: We calculated expected richness at a
standardized sample size of 10 individuals using
vegan::rarefy(). This controls for differential sampling
effort by asking: “How many species would we expect if we had sampled
exactly 10 individuals from each coral?” Only corals with ≥10
individuals (n = 68, 81% of sample) were included.
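For readers unfamiliar with rarefaction, the expected richness in a random subsample of n individuals is given by the standard hypergeometric (Hurlbert) formula, E[S_n] = Σ_i [1 − C(N − N_i, n) / C(N, n)]. The Python sketch below implements that formula with invented counts; it is not the authors' code, and `rarefied_richness` is a hypothetical name.

```python
from math import comb

def rarefied_richness(counts, sample_size):
    """Expected number of species in a random subsample of `sample_size` individuals."""
    total = sum(counts)
    if sample_size > total:
        raise ValueError("sample_size exceeds the number of individuals")
    denom = comb(total, sample_size)
    # For each species: 1 minus the probability that the subsample misses it
    return sum(1.0 - comb(total - c, sample_size) / denom for c in counts if c > 0)

# Invented coral: 25 individuals across 5 species, rarefied to n = 10
coral_counts = [12, 6, 4, 2, 1]
print(round(rarefied_richness(coral_counts, 10), 2))
```

The `ValueError` branch corresponds to the exclusion rule described above: corals with fewer than 10 individuals cannot be rarefied to n = 10.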
Residualization: We regressed raw richness on log(abundance) (R² = 0.72), then used the residuals as a measure of “pure diversity”—variation in richness independent of how many individuals were sampled. Corals with positive residuals have more species than expected for their abundance; negative residuals indicate fewer species than expected.
Evenness: Pielou’s J = H’/log(S), where H’ is Shannon diversity and S is richness. Dividing by log(S), the maximum H’ attainable with S species, rescales the metric to the 0–1 range, so it measures how evenly individuals are distributed among species rather than how many species are present.
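A minimal Python sketch of the evenness calculation, using natural logarithms and invented counts. Returning NaN for single-species samples is a choice made for this illustration, not something stated in the supplement.

```python
import math

def shannon(counts):
    """Shannon diversity H' (natural log) from species counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Pielou's J = H' / ln(S); undefined (NaN) when fewer than two species."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return float("nan")
    return shannon(counts) / math.log(richness)

even_coral = [10, 10, 10, 10]    # four equally abundant species
skewed_coral = [37, 1, 1, 1]     # same richness, one dominant species

print(round(pielou_evenness(even_coral), 2))
print(round(pielou_evenness(skewed_coral), 2))
```

Both example corals have the same richness and the same total abundance, so the difference in J reflects the abundance distribution alone.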
To test whether specific taxa drive any relationship, we examined individual species correlations with condition:
| Species | Prevalence | Correlation | p (raw) | p (FDR) | Interpretation |
|---|---|---|---|---|---|
| Hapalocarcinus (gall crab) | 6 corals | r = +0.27 | 0.013 | 0.36 | Only 2 species p < 0.05 raw |
| Periclimenes (cleaner shrimp) | 16 corals | r = +0.23 | 0.038 | 0.49 | Neither survives FDR correction |
| Fennera (pistol shrimp) | 26 corals | r = +0.20 | 0.062 | 0.49 | Marginal, not significant |
| Breviturma pica (brittle star) | 20 corals | r = +0.20 | 0.070 | 0.49 | Marginal, not significant |
| Trapezia tigrina (guard crab) | 11 corals | r = +0.15 | 0.160 | 0.67 | Not significant |
| Calcinus latens (hermit) | 25 corals | r = −0.15 | 0.167 | 0.67 | Not significant (negative) |
We tested whether removing any single species substantially changed the richness-condition correlation:
This confirms that the (non-significant) correlation is distributed across many species rather than driven by any particular taxon.
We tested whether any functional group showed elevated condition in species-rich assemblages:
| Functional Group | Richness–Condition r | p-value | Survives FDR? | Interpretation |
|---|---|---|---|---|
| Crabs (Brachyura) | +0.27 | 0.014 | No (p = 0.056) | Crab richness marginally associated; may reflect guardian crab presence |
| Shrimp (Caridea) | +0.04 | 0.73 | No | No effect |
| Fish | +0.07 | 0.52 | No | No effect |
| Snails (Gastropoda) | +0.09 | 0.44 | No | No effect |
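The supplement does not name the FDR procedure. Assuming the Benjamini–Hochberg method, the adjusted p-value for crabs can be reproduced from the four raw p-values in the table above, as in this Python sketch (`benjamini_hochberg` is a hypothetical helper).

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

groups = ["Crabs", "Shrimp", "Fish", "Snails"]
raw_p = [0.014, 0.73, 0.52, 0.44]   # raw p-values from the table above

for group, p_adj in zip(groups, benjamini_hochberg(raw_p)):
    print(f"{group:7s} adjusted p = {p_adj:.3f}")
```

With four tests, the smallest raw p-value is multiplied by 4, which is why p = 0.014 becomes 0.056 and no longer passes the 0.05 threshold.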
The comprehensive diagnostic analysis demonstrates:
Biological interpretation: These results do not mean cryptofauna provide no benefits to corals—guardian crabs demonstrably protect their hosts in experimental studies. However, our observational data show no detectable relationship between community attributes (diversity or composition) and coral physiological condition.
Reproducibility: Full diagnostic analysis in
scripts/richness_condition_diagnostic.R. Output tables:
output/tables/richness_condition_diagnostic.csv,
output/tables/species_condition_correlations.csv,
output/tables/leave_one_species_out.csv.
Supports: Main text Results → Result 1 (page 5); provides diversity context
| Site | Richness (S) | Shannon (H’) | Simpson (1-D) | Evenness (J’) | Test |
|---|---|---|---|---|---|
| HAU | 14.2 ± 5.1 | 2.12 ± 0.45 | 0.81 ± 0.08 | 0.78 ± 0.12 | |
| MAT | 16.8 ± 6.2 | 2.34 ± 0.52 | 0.84 ± 0.07 | 0.81 ± 0.10 | |
| MRB | 12.9 ± 4.8 | 1.98 ± 0.41 | 0.79 ± 0.09 | 0.76 ± 0.13 | |
| Overall | 14.7 ± 5.5 | 2.15 ± 0.48 | 0.81 ± 0.08 | 0.78 ± 0.12 | KW p = 0.003 |
Reproducibility: Diversity calculations in
scripts/04_diversity_analysis.R. Output:
output/tables/alpha_diversity_metrics.csv.
Supports: Main text Methods → Statistical Analyses; validates model selection
| Model | Fixed Effects | AIC | ΔAIC | R²m | R²c |
|---|---|---|---|---|---|
| Volume only | 1 | 876.2 | 56.4 | 0.42 | 0.58 |
| Volume + Site | 2 | 834.5 | 14.7 | 0.51 | 0.64 |
| Volume + Site + Branch Width | 3 | 821.3 | 1.5 | 0.55 | 0.67 |
| Full (+ Condition) | 4 | 819.8 | 0 | 0.56 | 0.68 |
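The ΔAIC column is each model's AIC minus the lowest AIC in the candidate set. The Python check below recomputes it from the AIC values in the table above. Under the common rule of thumb that models within about 2 AIC units have comparable support (a convention, not a claim made in this supplement), the model without Condition is nearly indistinguishable from the full model.

```python
# AIC values copied from the model comparison table above
aic = {
    "Volume only": 876.2,
    "Volume + Site": 834.5,
    "Volume + Site + Branch Width": 821.3,
    "Full (+ Condition)": 819.8,
}

best_aic = min(aic.values())
delta_aic = {model: round(value - best_aic, 1) for model, value in aic.items()}

for model, delta in delta_aic.items():
    print(f"{model:30s} dAIC = {delta:5.1f}")
```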
Reproducibility: Model comparison in
scripts/05_coral_cafi_relationships.R (lines 250–300).
All figures provide visual support for main text findings. Each figure caption explains what it shows, why it matters, and where to find the corresponding main text discussion.
Supports: Main text Methods → Study System (page 3); Figure 1A in main text
Figure S1. Study sites around Mo’orea, French Polynesia. Three sites span a gradient from sheltered lagoon (MAT, green) to exposed barrier reef (MRB, purple). HAU (fringing reef, blue) experiences intermediate wave exposure. Inset shows location in the South Pacific. Site selection ensures representation of major reef environments on Mo’orea. Why this matters: Environmental heterogeneity among sites tests whether cryptofauna assembly rules generalize or are context-dependent.
Reproducibility:
scripts/03_spatial_patterns.R (lines 20–80).
Supports: Main text Methods → Coral and Fauna Sampling; validates sampling design
Figure S2. Coral colony volume distributions by site. Box plots show median, interquartile range, and outliers. All sites span similar volume ranges (approximately 100–10,000 cm³), ensuring that site effects on community composition are not confounded by systematic size differences. Why this matters: If sites differed systematically in coral size, apparent site effects could actually reflect size effects.
Reproducibility:
scripts/05a_coral_characteristics.R (lines 50–80).
Supports: Main text Results → Result 4 (page 6); Supplementary Methods S1.4
Figure S3. Validation of position correction for physiological measurements. (A) Stump length (sampling position) increases with colony volume (r = 0.565, p < 0.001), creating a confound. (B) Position-corrected traits show no significant correlation with volume (|r| < 0.10). Why this matters: Without this correction, any diversity–condition analysis could be confounded by both variables correlating with colony size.
Reproducibility:
scripts/05a_coral_characteristics.R (lines 120–180).
Supports: Main text Results → Result 1 (page 5)
Figure S4. Alpha diversity metrics across sites. (A) Species richness per coral. (B) Shannon diversity (H’). (C) Simpson diversity (1-D). (D) Pielou’s evenness. MAT (lagoon) shows highest diversity across all metrics; MRB (barrier reef) shows lowest. Kruskal-Wallis tests indicate significant site differences for richness and Shannon (p < 0.01). Why this matters: Site differences in diversity support the PERMANOVA finding that reef environment structures cryptofauna communities.
Reproducibility:
scripts/04_diversity_analysis.R (lines 100–150).
Supports: Main text Methods → validates sampling completeness
Figure S5. Species rarefaction curves by site. Expected richness as a function of sampling effort (number of individuals). Shaded regions indicate 95% confidence intervals. Curves approach but do not fully reach asymptotes, suggesting some rare species remain unsampled. However, the curves are sufficiently similar to justify between-site comparisons. Why this matters: Rarefaction confirms that site differences in diversity are not artifacts of unequal sampling effort.
Reproducibility:
scripts/04_diversity_analysis.R (lines 160–200).
Supports: Main text Figure 2A; Results → Result 1 (page 5)
Figure S6. NMDS ordination of CAFI communities. Points represent individual corals, colored by site with 95% confidence ellipses. Environmental vectors show significant correlations with ordination axes (p < 0.05). Site separation confirms PERMANOVA finding that reef environment structures community composition (Table S2). Stress = 0.18 indicates adequate ordination quality. Why this matters: Visual confirmation that sites support compositionally distinct cryptofauna assemblages.
Reproducibility:
scripts/04_diversity_analysis.R (lines 220–280).
Supports: Main text Figure 2A; Results → Result 2 (page 5)
Figure S7. Fauna abundance scales sublinearly with coral volume. Log-log plot showing CAFI abundance vs. coral volume. Points colored by site; solid line shows GLMM prediction with 95% CI. Dashed line indicates isometric scaling (β = 1). The observed slope (β = 0.46) is significantly less than 1.0, indicating that larger corals harbor proportionally fewer fauna per unit volume, consistent with propagule dilution. Why this matters: This is the central result supporting the propagule dilution hypothesis.
Reproducibility:
scripts/05_coral_cafi_relationships.R (lines 80–150).
Supports: Main text Discussion → Propagule Dilution (page 7); Table S4
Figure S8. Site-specific scaling relationships. Separate log-log regressions for each site. All sites show consistent sublinear scaling (β < 1), though MAT (lagoon) shows the tightest relationship (R² = 0.62). Site-specific exponents range from 0.44 (MAT) to 0.52 (HAU), overlapping confidence intervals indicating no significant heterogeneity. Why this matters: Confirms that propagule dilution operates consistently across reef environments.
Reproducibility:
scripts/05_coral_cafi_relationships.R (lines 160–200).
Supports: Main text Figure 4; Results → Result 3 (page 6)
Figure S9. Species co-occurrence network. Nodes represent species (sized by abundance, colored by functional group). Edges connect species with significant positive associations (ρ > 0.3, p < 0.05). Network shows significant modularity (Q = 0.52 vs. null expectation 0.08, p < 0.0001). The snapping shrimp Alpheus diadema (highlighted) emerges as the most connected species. Why this matters: Demonstrates that cryptofauna form structured assemblages, not random collections of species.
Reproducibility:
scripts/06_network_analysis.R (lines 50–120).
Supports: Main text Table 1; Results → Result 3 (page 6)
Figure S10. Network module structure. (A) Network colored by module membership (Louvain algorithm). (B) Module composition by taxonomic group. (C) Degree distribution showing approximate scale-free properties. Six modules correspond to functionally coherent species groups (e.g., Module 1 = guardian crabs and associated shrimp; Module 2 = brittle stars). Why this matters: Modular structure suggests ecological organization—species within modules may share habitat requirements or facilitate each other’s presence.
Reproducibility:
scripts/06_network_analysis.R (lines 130–180).
Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship (page 6); Table S8
Figure S11. Raw vs. corrected diversity-condition relationships. Total CAFI abundance vs. coral condition score. While raw richness shows an apparent positive relationship when controlling only for volume (β = 0.058, p = 0.041), this effect is a sampling artifact: larger corals yield more individuals, inflating observed richness. When properly corrected (rarefaction, residualization, evenness), all diversity metrics show no significant relationship with condition (see Table S8b). Points colored by site. Why this matters: Demonstrates how species-area sampling artifacts can produce spurious diversity-function relationships.
Reproducibility:
scripts/18_cafi_predicts_condition.R and
scripts/richness_condition_diagnostic.R. See Table S8 for
complete diagnostic analysis.
Supports: Main text Methods → Fauna Sampling (page 3); provides community context
Figure S12. Taxonomic composition of CAFI communities. (A) Overall composition showing dominance of decapod crustaceans (crabs + shrimp = 78% of individuals). (B) Taxonomic composition varies by site: HAU has more diverse gastropod/polychaete fauna; MAT is dominated by trapezid crabs; MRB shows elevated shrimp diversity. Why this matters: Context for interpreting which functional groups drive community patterns.
Reproducibility:
scripts/02_community_composition.R (lines 50–100).
Supports: Main text Results → Result 2 (Sublinear Scaling); extends to compositional turnover
Figure S13. Community composition shifts with coral size. (A) Stacked bar chart showing proportional abundance of major taxonomic groups across coral volume bins. Crabs dominate small corals while fish and snails increase in larger corals. (B) Proportional abundance of key taxa vs. coral volume on a log scale. Crabs show strong negative correlation (r = -0.56), while fish (r = +0.32) and snails (r = +0.28) increase with size. Why this matters: Beyond abundance scaling (Result 2), composition also shifts—larger corals support different assemblages dominated by larger-bodied taxa (fish, snails) rather than small crabs.
Reproducibility:
scripts/generate_composition_size_figure.R. Full 4-panel
version:
output/figures/composition/composition_by_size.png.
Supports: Main text Results → Result 4 (Coral Condition); examines neighborhood effects on composition
Figure S14. Community composition varies with coral neighborhood. (A) Proportional composition by number of neighboring corals (within 5m radius). Corals in denser neighborhoods show slightly higher shrimp proportions. (B) Proportional abundance vs. neighbor count for key taxa. Shrimp increase modestly with more neighbors (r = 0.33), while snails decrease (r = -0.22). However, these patterns are weaker than size effects (Figure S13). Why this matters: Neighborhood context has minimal effect on composition compared to coral size, consistent with the independence of condition from neighborhood (Result 4).
Reproducibility:
scripts/generate_composition_neighborhood_figure.R. Full
4-panel version:
output/figures/composition/composition_by_neighborhood.png.
Supports: Main text Results → integrates size and neighborhood effects on composition
Figure S15. Distance-based Redundancy Analysis (db-RDA) of CAFI composition. (A) RDA biplot showing coral communities constrained by size and neighborhood variables. Points colored by coral volume; red arrows show environmental vectors. The log(Volume) vector is longest, indicating coral size is the strongest predictor. (B) Marginal significance of each predictor (Type III tests). No predictor is individually significant; coral size has the largest F (F = 1.64, p = 0.15), and neighborhood metrics (# neighbors, mean distance, neighbor volume) contribute essentially no unique explanatory power. (C) Variance partitioning: coral size explains 1.8% unique variance; neighborhood metrics explain ~0% unique variance (slightly negative after adjustment). (D–F) Correlations of RDA axis 1 with each predictor confirm that the primary compositional gradient aligns with coral size (r = 0.50, p < 0.001). Why this matters: Multivariate analysis indicates that coral size is the strongest predictor of community composition, although its marginal effect is not significant at α = 0.05, while neighborhood context adds no explanatory power.
Statistical Details: db-RDA Analysis
capscale() in vegan package
Reproducibility: scripts/generate_composition_rda_figure.R. Simplified manuscript version: output/figures/manuscript/Figure8_rda_composition.png.
Supports: Figures S13–S15; quantifies relative importance of size vs. neighborhood
| Predictor | F-statistic | p-value | Unique Variance | RDA1 Correlation | Interpretation |
|---|---|---|---|---|---|
| Coral Size (log vol) | 1.64 | 0.15 | 1.8% | r = 0.50* | Primary driver |
| # Neighbors | 0.66 | 0.61 | ~0% | r = -0.31* | Weak (confounded w/ size) |
| Mean Distance | 1.08 | 0.37 | 0.1% | r = 0.09 | Negligible |
| Neighbor Volume (log) | 0.88 | 0.47 | ~0% | r = -0.16 | Negligible |
Reproducibility: Results from
scripts/generate_composition_rda_figure.R.
Supports: Figures S13 and S15; identifies which taxa drive size-composition relationship
| Taxon | RDA1 Score | Univariate r | Size Association | Biological Interpretation |
|---|---|---|---|---|
| Snail | +0.71 | +0.28** | Large corals | Larger body size, space needs |
| Fish | +0.43 | +0.32** | Large corals | Territorial behavior |
| Hermit | +0.19 | +0.15 | Large corals | Shell availability in large corals |
| Crab | 0.00 | -0.56* | Small corals | Outcompeted by fish in large corals? |
| Shrimp | 0.00 | -0.12 | Neutral | Ubiquitous |
| Other | -0.01 | +0.08 | Neutral | Mixed taxa |
| Echinoderm | -0.27 | -0.18 | Small corals | Prefer cryptic small spaces? |
Reproducibility: Correlations from
scripts/generate_composition_size_figure.R; RDA scores from
scripts/generate_composition_rda_figure.R.
Supports: Main text Results → Result 2; Table 2 in main text; Figure 3B–D
This table provides the complete statistical output for all neighborhood metrics tested as predictors of CAFI abundance, controlling for coral volume. These results support the main text finding that coral size dominates and neighborhood context is negligible.
| Metric | Definition | β (Effect) | SE | p-value | Unique R² | Direction | Interpretation |
|---|---|---|---|---|---|---|---|
| Neighbor count | # corals within 5m radius | +0.003 | 0.002 | 0.057 | <0.1% | Weak positive | Marginal spillover effect; biologically trivial |
| Neighbor volume (log) | Total cm³ of neighboring corals | −0.074 | 0.018 | <0.001 | <0.5% | Negative | Confounded with coral size; large neighbors = large focal corals |
| Isolation index | Mean distance / coral size^(1/3) | +0.043 | 0.14 | 0.76 | ~0% | None | No propagule redirection at meter scales |
| Relative size (log) | Focal volume / mean neighbor volume | +0.020 | 0.005 | <0.001 | <0.5% | Positive | Competitive asymmetry; ‘big fish in small pond’ |
| Spillover potential (log) | Neighbor volume / mean distance | −0.079 | 0.018 | <0.001 | <0.5% | Negative | Opposite of facilitation prediction |
Statistical Details: Neighborhood Analysis
CAFI_abundance ~ neighborhood_metric + log(volume) (Poisson GLM)
Reproducibility: Analysis in scripts/14_local_neighborhood_effects.R and scripts/Fig6_comprehensive_neighborhood_effects.R. Results saved to output/objects/H3_neighborhood_results.rds.
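The model form above can be sketched in R as follows. This is a minimal illustration on simulated data, not the code in scripts/14_local_neighborhood_effects.R: the data frame `dat`, its column names, and the simulated coefficients are assumptions. "Unique R²" is approximated here as the change in a deviance-based pseudo-R² when the neighborhood metric is dropped, which is one of several possible definitions and may differ from the one used for the table.

```r
# Illustrative sketch only: simulated data and hypothetical column names.
set.seed(42)
n <- 114  # matches the number of coral colonies listed for the survey
dat <- data.frame(
  volume         = exp(rnorm(n, mean = 8, sd = 1)),  # coral volume (cm^3), simulated
  neighbor_count = rpois(n, lambda = 6)              # corals within 5 m, simulated
)
# Simulate abundance with sublinear volume scaling and no true neighbor effect
dat$cafi_abundance <- rpois(n, lambda = exp(-3 + 0.75 * log(dat$volume)))

# Full model: one neighborhood metric plus log(volume) as the size control
fit_full <- glm(cafi_abundance ~ neighbor_count + log(volume),
                family = poisson(link = "log"), data = dat)
# Reduced model: coral size only
fit_size <- glm(cafi_abundance ~ log(volume),
                family = poisson(link = "log"), data = dat)

# Deviance-based pseudo-R^2 (an assumption; other definitions exist)
pseudo_r2 <- function(fit) 1 - fit$deviance / fit$null.deviance
unique_r2 <- pseudo_r2(fit_full) - pseudo_r2(fit_size)

coef(summary(fit_full))["neighbor_count", ]  # estimate, SE, z value, p-value
unique_r2                                    # variance share unique to the metric
```

Comparing the full and size-only fits keeps the size control explicit, so a metric that is merely correlated with coral volume contributes little unique explanatory power.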
Supports: Main text Figure 2B–D; Table S13
Figure S16. Comprehensive local neighborhood effects on CAFI communities. Six-panel figure showing meter-scale spatial effects. (A) CAFI abundance vs. number of neighboring corals: weak positive trend. (B) CAFI abundance vs. total neighbor volume: negative relationship (confounded with coral size). (C) CAFI abundance vs. isolation index: no significant effect. (D) CAFI abundance vs. relative size: modest positive effect for corals larger than their neighbors. (E) CAFI abundance vs. spillover potential: negative relationship, opposite to the facilitation prediction. (F) Summary by neighbor density category. Key finding: After controlling for coral size, neighborhood metrics collectively explain <1% of variance in CAFI abundance. Coral size is the dominant predictor.
Reproducibility: scripts/Fig6_comprehensive_neighborhood_effects.R.
This section presents the detailed analysis demonstrating that neither diversity nor composition robustly predicts coral physiological condition, supporting Key Finding 2 in the main text.
Supports: Main text Results → Result 2: No Evidence for Diversity-Condition Relationship
Figure S17. Community composition does not robustly predict coral condition. CAFI Community PC1 (x-axis) vs. Coral Condition Score PC1 (y-axis). Points colored by site. While the Pearson correlation appears significant (r = −0.29, p = 0.007), this result is driven entirely by 3 extreme observations. Robust correlation methods show no relationship: Spearman ρ = −0.06 (p = 0.61), Kendall τ = −0.03 (p = 0.67).
We tested whether community composition—independent of diversity—predicted coral condition. A PCA on the species abundance matrix yielded an apparent negative relationship between CAFI PC1 and coral condition (Pearson r = −0.29, p = 0.007). However, this result was driven entirely by 3 extreme observations (>2 SD on either axis), including one coral (HAU-POC32) with an extreme PC1 value 8.5 standard deviations below the mean.
When we applied robust correlation methods that are insensitive to outliers, the relationship disappeared:
- Spearman’s ρ = −0.06 (p = 0.61)
- Kendall’s τ = −0.03 (p = 0.67)
- Excluding the 3 extreme points: R² = 0.004 (p = 0.57)
No individual species survived FDR correction for multiple testing (all q > 0.35). We therefore conclude that community composition, like diversity, shows no robust relationship with coral condition in this dataset.
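The robustness checks described above can be sketched with base R. This is an illustration on simulated scores, not the analysis in scripts/18_cafi_predicts_condition.R: the vectors `community_pc1` and `condition_pc1`, the three planted extreme points, and the simulated per-species p-values are all assumptions, and Benjamini-Hochberg is assumed as the FDR procedure.

```r
# Illustrative sketch only: simulated scores, not the study data.
set.seed(1)
n <- 80
community_pc1 <- rnorm(n)
condition_pc1 <- rnorm(n)
# Plant three extreme points that induce a negative linear trend
community_pc1[1:3] <- c(-8.5, -4, 4)
condition_pc1[1:3] <- c(4, 3, -3)

# Pearson is sensitive to the extreme points; rank-based methods are less so
pearson  <- cor.test(community_pc1, condition_pc1, method = "pearson")
spearman <- cor.test(community_pc1, condition_pc1, method = "spearman")
kendall  <- cor.test(community_pc1, condition_pc1, method = "kendall")

# Flag observations more than 2 SD from the mean on either axis
z_x <- as.vector(scale(community_pc1))
z_y <- as.vector(scale(condition_pc1))
is_extreme <- abs(z_x) > 2 | abs(z_y) > 2

# Refit after excluding the flagged observations
trimmed <- lm(condition_pc1[!is_extreme] ~ community_pc1[!is_extreme])
trimmed_r2 <- summary(trimmed)$r.squared

# FDR adjustment across per-species tests (p-values simulated here)
p_species <- runif(20)
q_species <- p.adjust(p_species, method = "BH")

c(pearson = unname(pearson$estimate),
  spearman = unname(spearman$estimate),
  kendall = unname(kendall$estimate),
  trimmed_r2 = trimmed_r2)
```

Reporting the parametric estimate alongside the rank-based and trimmed estimates makes it visible when a correlation depends on a handful of observations.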
Reproducibility: Analysis in scripts/18_cafi_predicts_condition.R. Figure: output/figures/cafi_predicts_condition/community_pc1_vs_condition_pc1.png.
This section provides a complete guide to reproducing all analyses and figures in the manuscript.
All data files are in the data/ directory:
| File | Description | Rows | Main Variables |
|---|---|---|---|
| survey_cafi_data_w_taxonomy_summer2019_v5.csv | Individual fauna records | 3,989 | coral_id, species, count |
| survey_coral_characteristics_merged_v2.csv | Coral colony data | 114 | coral_id, site, volume, GPS |
| survey_master_phys_data_v3.csv | Coral physiology | 108 | coral_id, protein, zoox_density |
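The files share the coral_id key, so fauna records can be summarized per coral and joined to the colony table. The sketch below uses small in-memory data frames in place of the CSV files; the coral IDs, species labels, and values are hypothetical, and treating corals without fauna records as zeros is an assumption rather than a documented rule of the pipeline.

```r
# Illustrative sketch only: toy rows standing in for the CSV files above.
fauna <- data.frame(
  coral_id = c("HAU-POC01", "HAU-POC01", "MAT-POC02", "MRB-POC03"),
  species  = c("sp_a", "sp_b", "sp_a", "sp_c"),
  count    = c(3, 1, 2, 5)
)
corals <- data.frame(
  coral_id = c("HAU-POC01", "MAT-POC02", "MRB-POC03", "MRB-POC04"),
  site     = c("HAU", "MAT", "MRB", "MRB"),
  volume   = c(1200, 850, 4300, 600)
)

# Total abundance per coral (sum of counts across species)
abundance <- aggregate(count ~ coral_id, data = fauna, FUN = sum)
names(abundance)[2] <- "cafi_abundance"

# Species richness per coral (number of distinct species recorded)
rich <- tapply(fauna$species, fauna$coral_id, function(x) length(unique(x)))
richness <- data.frame(coral_id = names(rich), richness = as.vector(rich))

# Left join onto the coral table so every colony is retained
per_coral <- merge(corals, merge(abundance, richness, by = "coral_id"),
                   by = "coral_id", all.x = TRUE)
# Assumption: a coral with no fauna records had zero fauna
per_coral$cafi_abundance[is.na(per_coral$cafi_abundance)] <- 0
per_coral$richness[is.na(per_coral$richness)] <- 0

per_coral
```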
Scripts in scripts/ are numbered for sequential execution:
| Script | Purpose | Key Outputs | Main Text Section |
|---|---|---|---|
| 01_load_clean_data.R | Data import and cleaning | Master dataset | Methods |
| 02_community_composition.R | Taxonomic summaries | Species tables | Methods |
| 03_spatial_patterns.R | Site maps | Figure S1 | Methods |
| 04_diversity_analysis.R | PERMANOVA, NMDS, diversity | Tables S2, S9; Figures S4–S6 | Result 1 |
| 05_coral_cafi_relationships.R | Scaling analyses | Tables S3–S4; Figures S7–S8 | Result 2 |
| 05a_coral_characteristics.R | Position correction, condition | Table S7; Figure S3 | Result 1 |
| 06_network_analysis.R | Co-occurrence networks | Tables S5–S6; Figures S9–S10 | Result 4 |
| 18_cafi_predicts_condition.R | Richness–condition diagnostic | Table S8; Figure S11 | Result 2 |
# 1. Set working directory to project root
setwd("/path/to/CAFI-Survey-2026")
# 2. Run master script (executes all analyses)
source("scripts/run_all_survey_analyses.R")
# 3. Render documents
rmarkdown::render("output/manuscript/MANUSCRIPT.Rmd")
rmarkdown::render("output/manuscript/SUPPLEMENTARY_MATERIALS.Rmd")
All outputs are in output/:
- tables/: CSV files of statistical results
- figures/: PNG files at 300 DPI
- objects/: RDS files of fitted models
- manuscript/: Rendered HTML documents
All data and code are publicly available:
For questions about data or analysis, contact: Adrian Stier (astier@ucsb.edu)
Supplementary Materials for: Stier et al. “Sublinear scaling and modular network structure reveal assembly rules for coral-associated cryptofauna”